Stochastic Comparison of Discounted Rewards
Similar resources
Discounted Stochastic Games
We give an alternative proof to a result of Mertens and Parthasarathy, stating that every n-player discounted stochastic game with general setup, and with a norm-continuous transition, has a subgame perfect equilibrium. † Institute of Mathematics and Center for Rationality and Interactive Decision Theory, The Hebrew University, Givat Ram, 91904 Jerusalem, Israel. ...
Policy Iteration Algorithms for DEC-POMDPs with Discounted Rewards
Over the past seven years, researchers have been trying to find algorithms for the decentralized control of multiple agents under uncertainty. Unfortunately, most of the standard methods are unable to scale to real-world-size domains. In this paper, we come up with promising new theoretical insights to build scalable algorithms with provable error bounds. In the light of the new theoretical insi...
Constrained Markov Decision Models with Weighted Discounted Rewards
This paper deals with constrained optimization of Markov Decision Processes. Both objective function and constraints are sums of standard discounted rewards, but each with a different discount factor. Such models arise, e.g. in production and in applications involving multiple time scales. We prove that if a feasible policy exists, then there exists an optimal policy which is (i) stationary (non...
Continuous Time Markov Decision Processes with Expected Discounted Total Rewards
This paper discusses continuous time Markov decision processes with the criterion of expected discounted total rewards, where the state space is countable, the reward rate function is extended real-valued and the discount rate is a real number. Under necessary conditions that the model is well defined, the state space is partitioned into three subsets, on which the optimal value function ...
Computing Stackelberg Equilibria in Discounted Stochastic Games
Stackelberg games increasingly influence security policies deployed in real-world settings. Much of the work to date focuses on devising a fixed randomized strategy for the defender, accounting for an attacker who optimally responds to it. In practice, defense policies are often subject to constraints and vary over time, allowing an attacker to infer characteristics of future policies based on ...
Journal
Journal title: Journal of Applied Probability
Year: 2011
ISSN: 0021-9002,1475-6072
DOI: 10.1239/jap/1300198151